
Conversation

@sayakpaul sayakpaul commented Jan 14, 2026

What does this PR do?

This PR assesses whether we can move our CI back to transformers main. It will also help us migrate to transformers v5 successfully.

Notes

For the following test failure, see this internal discussion:

FAILED tests/pipelines/test_pipelines.py::DownloadTests::test_textual_inversion_unload - AttributeError: CLIPTokenizer has no attribute _added_tokens_decoder. Did you mean: 'added_tokens_decoder'?

@sayakpaul sayakpaul requested a review from DN6 January 14, 2026 09:23
- "tests/pipelines/test_pipelines_common.py"
- "tests/models/test_modeling_common.py"
- "examples/**/*.py"
- ".github/**.yml"

sayakpaul (Member Author):

Temporary. For this PR.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@sayakpaul sayakpaul marked this pull request as draft January 15, 2026 12:05
@sayakpaul sayakpaul changed the title from "switch to transformers main again." to "[main] switch to transformers main again." Jan 15, 2026
@sayakpaul sayakpaul changed the title from "[main] switch to transformers main again." to "[wip] switch to transformers main again." Jan 15, 2026
logger.addHandler(stream_handler)


@unittest.skipIf(is_transformers_version(">=", "4.57.5"), "Size mismatch")

torch.nn.ConvTranspose2d,
torch.nn.ConvTranspose3d,
torch.nn.Linear,
torch.nn.Embedding,

sayakpaul (Member Author):

Happening because of the way weight loading is done in v5.

Comment on lines +23 to +25
model = AutoModel.from_pretrained(
    "hf-internal-testing/tiny-stable-diffusion-torch", subfolder="text_encoder", use_safetensors=False
)

Comment on lines +281 to +283
input_ids = (
    input_ids["input_ids"] if not isinstance(input_ids, list) and "input_ids" in input_ids else input_ids
)

inputs = {
    "prompt": "dance monkey",
-   "negative_prompt": "",
+   "negative_prompt": "bad",

sayakpaul (Member Author):

Otherwise, the corresponding tokenizer outputs:

negative_prompt=[' ']
prompt=[' ']
text_input_ids=tensor([], size=(1, 0), dtype=torch.int64)

which leads to:

E       RuntimeError: cannot reshape tensor of 0 elements into shape [1, 0, -1, 8] because the unspecified dimension size -1 can be any value and is ambiguous
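
For context, the reshape failure can be reproduced in isolation (a minimal sketch built only from the error output above, not from the pipeline code):

import torch

# An effectively empty prompt tokenizes to a (1, 0) int64 tensor, as shown above.
text_input_ids = torch.empty((1, 0), dtype=torch.int64)

# Reshaping 0 elements with an inferred (-1) dimension is ambiguous, so this raises
# the same RuntimeError quoted in the comment.
text_input_ids.reshape(1, 0, -1, 8)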

Comment on lines +569 to +580
if is_transformers_version("<=", "4.58.0"):
    token = tokenizer._added_tokens_decoder[token_id]
    tokenizer._added_tokens_decoder[last_special_token_id + key_id] = token
    del tokenizer._added_tokens_decoder[token_id]
elif is_transformers_version(">", "4.58.0"):
    token = tokenizer.added_tokens_decoder[token_id]
    tokenizer.added_tokens_decoder[last_special_token_id + key_id] = token
    del tokenizer.added_tokens_decoder[token_id]
if is_transformers_version("<=", "4.58.0"):
    tokenizer._added_tokens_encoder[token.content] = last_special_token_id + key_id
elif is_transformers_version(">", "4.58.0"):
    tokenizer.added_tokens_encoder[token.content] = last_special_token_id + key_id

sayakpaul (Member Author):

Still doesn't solve the following issue:

FAILED tests/pipelines/test_pipelines.py::DownloadTests::test_textual_inversion_unload - AttributeError: CLIPTokenizer has no attribute _added_tokens_decoder. Did you mean: 'added_tokens_decoder'?

Internal thread: https://huggingface.slack.com/archives/C014N4749J9/p1768536480412119
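
For reference, a quick way to check which attribute a given transformers install exposes (a hedged sketch, not part of this PR; the checkpoint name is just an example):

from transformers import CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

# On the transformers releases we currently pin, both names exist; on v5 / main builds
# the private `_added_tokens_decoder` can be missing, which is the AttributeError above.
print(hasattr(tokenizer, "_added_tokens_decoder"))
print(hasattr(tokenizer, "added_tokens_decoder"))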

sayakpaul (Member Author):

@DN6 I have fixed most of the issues caused by v5 / main. But the main culprits (at least for the tests we're checking in this PR) seem to be the assertions on expected values. Of course, the obvious fix would be to either relax the tolerances or update the expected slices. But do you have any other suggestions?
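
For reference, the "relax the tolerances" option would look roughly like this (a sketch with made-up values; the real tests compare slices of actual pipeline outputs):

import torch

# Illustrative values only.
output_slice = torch.tensor([0.4733, 0.5129, 0.4807])
expected_slice = torch.tensor([0.4731, 0.5132, 0.4805])

# Looser atol/rtol than the defaults, so small numerical drift from the v5
# weight-loading changes does not fail the assertion.
torch.testing.assert_close(output_slice, expected_slice, atol=1e-3, rtol=1e-3)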
